A Traceable Data Fusion Based on Data Provenance

Author

  • Zhao Qiang
Abstract

Data fusion is an active topic in data integration and includes at least two stages: entity resolution and data conflict resolution. However, existing fusion processes are opaque to the user, and the fusion stages are isolated from one another. In this paper, we propose a traceable data fusion mechanism based on data provenance that can trace both the data sources of fusion results and the process by which those results evolved. The mechanism mainly targets the entity resolution and data conflict resolution stages. We represent the provenance of data origin using PI-CS, which is more accurate because it can record the intermediate steps of data evolution. To record the evolution process of data fusion, we propose two transformation provenances, entity resolution provenance and data conflict resolution provenance, which record the evolution of entity resolution and data conflict resolution, respectively. Finally, we give an example to validate the feasibility of the traceable mechanism for data fusion.
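
To make the two stages concrete, the following is a minimal sketch of how a fusion pipeline might log provenance at each stage. It is not the paper's mechanism and does not implement PI-CS; the `Record` type, the name-based matching rule, the majority-vote strategy, and all function names are hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str  # hypothetical source identifier
    rid: str     # hypothetical record id within that source
    name: str
    city: str


def resolve_entities(records):
    """Cluster records by normalized name (an assumed, simplistic matching rule).

    Returns the clusters and an entity resolution provenance log that lists
    which (source, rid) pairs were merged into each entity.
    """
    clusters = defaultdict(list)
    for r in records:
        clusters[r.name.strip().lower()].append(r)
    provenance = {
        key: [(r.source, r.rid) for r in group] for key, group in clusters.items()
    }
    return dict(clusters), provenance


def resolve_conflicts(cluster, attribute):
    """Pick one value per attribute by majority vote (an assumed strategy).

    Returns the chosen value and a conflict resolution provenance entry that
    records which inputs supported the value and which were rejected.
    """
    values = [getattr(r, attribute) for r in cluster]
    winner, _ = Counter(values).most_common(1)[0]
    provenance = {
        "attribute": attribute,
        "strategy": "majority_vote",
        "chosen": winner,
        "supporting": [
            (r.source, r.rid) for r in cluster if getattr(r, attribute) == winner
        ],
        "rejected": [
            (r.source, r.rid, getattr(r, attribute))
            for r in cluster
            if getattr(r, attribute) != winner
        ],
    }
    return winner, provenance


def fuse(records):
    """Run both stages and attach both provenance logs to each fused entity."""
    clusters, er_prov = resolve_entities(records)
    fused = {}
    for key, cluster in clusters.items():
        city, cr_prov = resolve_conflicts(cluster, "city")
        fused[key] = {
            "name": key,
            "city": city,
            "er_provenance": er_prov[key],
            "cr_provenance": cr_prov,
        }
    return fused


records = [
    Record("A", "a1", "Ann Lee", "Paris"),
    Record("B", "b7", "ann lee ", "Paris"),
    Record("C", "c3", "Ann Lee", "Lyon"),
]
result = fuse(records)
```

Because each fused entity carries both logs, a reader of `result` can ask which source records were merged into it and why a given attribute value won, which is the kind of question a traceable fusion process is meant to answer.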

Similar resources

Incremental Data Fusion Based on Provenance Information

Data fusion is the process of combining multiple representations of the same object, extracted from several external sources, into a single and clean representation. It is usually the last step of an integration process, which is executed after the schema matching and the entity identification steps. More specifically, data fusion aims at solving attribute value conflicts based on user-defined ...

Data provenance tracking as the basis for a biomedical virtual research environment

In complex data analyses it is increasingly important to capture information about the usage of data sets in addition to their preservation over time to ensure reproducibility of results, to verify the work of others and to ensure appropriate conditions data have been used for specific analyses. Scientific workflow based studies are beginning to realize the benefit of capturing this provenance ...

A New Method for Multisensor Data Fusion Based on Wavelet Transform in a Chemical Plant

This paper presents a new multi-sensor data fusion method based on the combination of wavelet transform (WT) and extended Kalman filter (EKF). Input data are first filtered by a wavelet transform via Daubechies wavelet “db4” functions and the filtered data are then fused based on variance weights in terms of minimum mean square error. The fused data are finally treated by extended Kalman filter...

Using Provenance to support Good Laboratory Practice in Grid Environments

Conducting experiments and documenting results is daily business of scientists. Good and traceable documentation enables other scientists to confirm procedures and results for increased credibility. Documentation and scientific conduct are regulated and termed as “good laboratory practice.” Laboratory notebooks are used to record each step in conducting an experiment and processing data. Origin...

Query Processing with Materialized Views in a Traceable P2P Record Exchange Framework

Materialized views which are derived from base relations and stored in the database are often used to speed up query processing. In this paper, we leverage them in a traceable peer-to-peer (P2P) record exchange framework which was proposed to ensure reliability among the exchanged data in P2P networks where duplicates and modifications of data occur independently in autonomous peers. In our pro...

Publication date: 2015